ABSTRACT

Bispectrum phase, closure phase and their generalisation to kernel-phase are all
independent of pupil-plane phase errors to first-order. This property, when used with
Sparse Aperture Masking (SAM) behind adaptive optics, has been used recently in 
high-contrast observations at or inside the formal diffraction limit of large telescopes. 
Finding the limitations to these techniques requires an understanding of spatial and 
temporal third-order phase effects, as well as effects such as time-variable dispersion 
when coupled with the non-zero bandwidths in real observations. In this paper, formu- 
lae describing many of these errors are developed and compared to the fundamental 
noise processes of photon- and background-noise. I show that the current generation 
of aperture-masking observations taken in good observing conditions are consistent 
with being limited by temporal phase noise and photon noise, which has relevance 
for plans to combine pupil-remapping with spatial filtering. Finally, I describe cal- 
ibration strategies for kernel-phase, including the optimised calibrator weighting as 
used for LkCal5, and the restricted kernel-phase POISE technique that avoids explicit 
dependence on calibrators. 

Key words: techniques: interferometric, instrumentation: adaptive optics, instru- 
mentation: high angular resolution 



1 INTRODUCTION 

The concepts of closure-phase, bispectrum phase (e.g.
Hofmann & Weigelt 1993), self-calibration and now kernel-phase
(Martinache 2010) are well-known as techniques that
cancel out many instrumental effects due to pupil-plane
phase errors. Despite the very long history of aperture-masking
with a focus on fringe visibility amplitude (Fizeau 1868;
Michelson 1891; Schwarzschild 1896), it was the use
of closure-phase that first enabled image reconstruction
from this technique (Baldwin et al. 1986) as well as recent
efforts in high-contrast imaging (e.g. Lloyd et al. 2006;
Kraus & Ireland 2012).

A simple explanation of closure-phase comes from a 
counting argument. From an interferometer with M (sub)- 
apertures, the complex visibilities can be independently 
measured on each of the M(M - 1)/2 baselines consisting of
each pair of (sub)-apertures. An optical aberration consist- 
ing of a piston on each of the (sub)-apertures amounts to 
M - 1 degrees of freedom in the phase differences, leaving
(M - 1)(M - 2)/2 additional measured quantities, which are
the linearly-independent set of closure-phases. A set of observables
which are independent of pupil-plane phase form
an ideal starting point for precise model-fitting and imaging 
at the diffraction-limit. This argument applies to both re- 
dundant and non-redundant pupil geometries, as realised by 
Martinache (2010). But if phase errors on a pupil are large, a
redundant pupil configuration is at a disadvantage, because 
the pairs of pupil locations that form any given Fourier com- 
ponent may add out-of-phase and destructively interfere. In 
the case of observations taken behind adaptive optics, the 
choice of one technique over the other is not obvious. 
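
The counting argument above can be checked by brute-force enumeration. The following is a minimal Python sketch; the function name `closure_counts` is purely illustrative and is not taken from any existing pipeline.

```python
from itertools import combinations


def closure_counts(m):
    """Return (baselines, closing triangles, independent closure-phases)
    for an interferometer with m (sub)-apertures."""
    baselines = m * (m - 1) // 2              # one per pair of apertures
    triangles = m * (m - 1) * (m - 2) // 6    # every closing triangle
    piston_dof = m - 1                        # phase differences set by pistons
    independent = baselines - piston_dof      # equals (m - 1)(m - 2)/2
    return baselines, triangles, independent


# Cross-check the closed forms by explicit enumeration for a 9-hole mask.
m = 9
n_base, n_tri, n_ind = closure_counts(m)
assert n_base == len(list(combinations(range(m), 2)))
assert n_tri == len(list(combinations(range(m), 3)))
assert n_ind == (m - 1) * (m - 2) // 2
print(n_base, n_tri, n_ind)
```

For a 9-hole mask this gives 36 baselines and 84 closing triangles, of which only 28 closure-phases are linearly independent.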

In this paper, I will outline the causes of contrast limitations
in the aperture-masking interferometry and kernel-phase
techniques, and means to maximise contrast. In Section 2
the main causes of kernel-phase errors will be outlined.
In Section 3 I will describe why the statistical correlations
between closure-phases mean that kernel-phases are
preferred as a primary observable, and will compare the contrast
limits achievable by different pupil geometries. In Section 4.1
I will describe standard closure-phase calibration
and its limitations, in Section 4.2 I will describe the calibration
strategy as used in Kraus & Ireland (2012) to maximise
contrast in aperture-masking interferometry observations,
and in Section 4.3 I will describe the simpler POISE
calibration strategy. In the final section I will conclude and outline
the key areas where further research is needed.



1.1 Kernel-Phase 

The definition of Kernel-phase as used in this paper will be
slightly simplified from the definition of Martinache (2010),
as we will avoid the use of the "redundancy" matrix R. To
first-order in pupil-plane phase (i.e. with a nearly-flat wavefront),
we can write the observed phase $\Phi_m$ in the Fourier
transform of an image as:

$$\Phi_m = \mathbf{A}\cdot\varphi + \Phi_o, \tag{1}$$

where $\varphi$ is the pupil-plane phase and $\Phi_o$ is the phase of
the Fourier transform of the object. These are represented as
vectors where each vector element is one discrete point in the
model pupil plane or the image discrete Fourier transform.
Using singular value decomposition, we then find a matrix
K, the Kernel of A, such that $\mathbf{K}\cdot\mathbf{A} = 0$. This matrix enables
us to project the Fourier phases onto a subspace, which we
will call the Kernel-phases $\theta = \mathbf{K}\cdot\Phi_m$. On this subspace,
the observables are not affected by pupil-plane phase errors
at first-order:

$$\theta = \mathbf{K}\cdot\Phi_m = (\mathbf{K}\cdot\mathbf{A})\cdot\varphi + \mathbf{K}\cdot\Phi_o = \mathbf{K}\cdot\Phi_o. \tag{2}$$

A model of the object can therefore be directly compared
to the observed Kernel-phases by computing the
Fourier transform and multiplying by the matrix K.



2 CAUSES OF KERNEL-PHASE ERRORS

2.1 General Pupil-Plane Phase Errors

We will consider the following abstract representation of
a closing triangle containing apertures A, B and C. Each
aperture has the same size and shape, and each baseline
1 = A→B, 2 = B→C and 3 = C→A has data taken
at the same time. That is, there is a common coordinate
system describing apertures A, B and C. This means that
the visibility on each baseline is formed by the incoherent
integral of visibilities arising from common spatio-temporal
coordinates in sub-apertures A, B and C.

Figure 1. An abstract representation of closure-phases formed
by baselines 1, 2 and 3, in turn formed by congruent apertures A,
B and C.

We will assign the symbols $\varphi_A$, $\varphi_B$ and $\varphi_C$ to the phase
in sub-apertures A, B and C, and the symbols $\Phi_1$, $\Phi_2$ and $\Phi_3$ to
the phase on baselines 1, 2 and 3 respectively. The complex
visibilities are then formed by:

$$V_1 = \overline{\exp i(\varphi_B - \varphi_A)}, \quad V_2 = \overline{\exp i(\varphi_C - \varphi_B)}, \quad V_3 = \overline{\exp i(\varphi_A - \varphi_C)}, \tag{3}$$

where the bar represents an average over the spatio-temporal
co-ordinates corresponding to each aperture. This
can be expanded to third-order in phase to:

$$V_1 \approx 1 + i\,\overline{(\varphi_B - \varphi_A)} - \frac{1}{2}\,\overline{(\varphi_B - \varphi_A)^2} - \frac{i}{6}\,\overline{(\varphi_B - \varphi_A)^3}, \tag{4}$$

with similar expressions for $V_2$ and $V_3$. The bispectrum
is given by the product of these three visibilities, which can
be again expanded to third-order in phase:

$$b_{ABC} = V_1 V_2 V_3 \tag{5}$$

$$\Re(b_{ABC}) \approx 1 - \frac{1}{2}\left[\overline{(\varphi'_B - \varphi'_A)^2} + \overline{(\varphi'_C - \varphi'_B)^2} + \overline{(\varphi'_A - \varphi'_C)^2}\right] \tag{6}$$

$$\Im(b_{ABC}) \approx -\frac{1}{6}\left[\overline{(\varphi'_B - \varphi'_A)^3} + \overline{(\varphi'_C - \varphi'_B)^3} + \overline{(\varphi'_A - \varphi'_C)^3}\right], \tag{7}$$

where we have considerably simplified the expansion by
introducing the piston-corrected phases:

$$\varphi'_A = \varphi_A - \overline{\varphi_A} \tag{8}$$
$$\varphi'_B = \varphi_B - \overline{\varphi_B} \tag{9}$$
$$\varphi'_C = \varphi_C - \overline{\varphi_C} \tag{10}$$

A more complete derivation of this expansion is given in
Appendix A. The closure phase $\phi_{\rm cp} = \Phi_1 + \Phi_2 + \Phi_3$ is then
most simply approximated by taking the leading terms in
the real (0th order) and imaginary (3rd order) components
of the bispectrum, giving $\phi_{\rm cp} \approx \Im(b_{ABC})$.

It is also worthwhile briefly considering the effects 
of averaging the visibilities for baselines 1, 2 and 3 over 
different spaces. This could be caused by differing sub- 
aperture shapes in conventional aperture-masking interfer- 
ometry (amounting to non-closing triangles), or by disjoint 
integration times in other forms of interferometry. In this 
case, the leading terms in the closure-phase errors become 
first-order rather than third order in pupil plane phase. 
Clearly, this is something to be avoided at considerable ef- 
fort in the case of high-contrast aperture-masking. The pupil 
"shape" can also be thought of as the pupil-plane amplitude 
within each sub-aperture. Where amplitude errors are taken 
into account, these closure-phase errors then become second- 
order, i.e. first-order in phase and first-order in amplitude, 
and could plausibly be the leading term. 
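
The third-order result of this section can be checked numerically. The sketch below (Python with NumPy) compares the exact closure phase of a closing triangle with the leading third-order term, i.e. minus one sixth of the summed mean cubes of the piston-corrected phase differences. The phase statistics (a large random piston plus 0.1 radian Gaussian structure, independent between apertures) and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def closure_phase_exact(phi_a, phi_b, phi_c):
    """Closure phase from visibilities averaged over common coordinates."""
    v1 = np.mean(np.exp(1j * (phi_b - phi_a)))
    v2 = np.mean(np.exp(1j * (phi_c - phi_b)))
    v3 = np.mean(np.exp(1j * (phi_a - phi_c)))
    return np.angle(v1 * v2 * v3)


def closure_phase_third_order(phi_a, phi_b, phi_c):
    """Leading third-order term using piston-corrected phases."""
    a = phi_a - phi_a.mean()
    b = phi_b - phi_b.mean()
    c = phi_c - phi_c.mean()
    cubes = np.mean((b - a) ** 3) + np.mean((c - b) ** 3) + np.mean((a - c) ** 3)
    return -cubes / 6.0


def compare(n_trials=200, n_points=200, sigma=0.1, seed=1):
    """Return exact and third-order closure phases for random realisations."""
    rng = np.random.default_rng(seed)
    exact = np.empty(n_trials)
    approx = np.empty(n_trials)
    for t in range(n_trials):
        # A piston per aperture (cancels in the closure phase) plus structure.
        phi = rng.normal(0.0, 1.0, (3, 1)) + rng.normal(0.0, sigma, (3, n_points))
        exact[t] = closure_phase_exact(*phi)
        approx[t] = closure_phase_third_order(*phi)
    return exact, approx


exact, approx = compare()
rms_ratio = np.sqrt(np.mean((exact - approx) ** 2) / np.mean(approx ** 2))
print("fractional rms difference:", rms_ratio)
```

For small phase errors the two agree closely, and the 1 radian pistons drop out entirely, as expected for a closing triangle averaged over common coordinates.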

2.2 Temporal Phase Errors

Our first application of Equation (7) to closure-phase errors is
temporal effects. There are two key regimes that temporal 
errors operate in behind an AO system. Either exposure- 
times are comparable to or shorter than the inverse of the 
AO system bandwidth (the short-exposure regime) or ex- 
posure times are significantly longer than these timescales 
(the long-exposure regime). Given typical coherence times 
at ~2.2 microns or shorter wavelengths of <50ms, and typi- 
cal AO system bandwidths in the range 10-100 Hz, exposure 
times longer than ~ 100 ms in the near-infrared are in the 
long-exposure regime. 

In the long-exposure regime, we can make the approximation
that piston noise is white up to some cutoff frequency
$f_c$. This is not very unrealistic, because in the frozen turbulence
approximation, the atmosphere has a power-law amplitude
spectrum that falls with frequency $f$, while the error signal from
a Proportional-Integral-Differential (PID) controller in the
mid-frequency range where the proportional term dominates
gives residual errors proportional to the input signal amplitude
multiplied by the frequency $f$. This gives a resultant
error amplitude that depends only weakly on $f$, up to the servo loop
cutoff. At this cutoff, three independent phenomena all tend
to cut off the error spectrum rapidly: the falling atmospheric
amplitude spectrum, the rapidly lowering gain of the servo
approaching its Nyquist sampling frequency, and effects of
spatial filtering.

We will now make a second set of approximations by as- 
suming that the phase piston on each sub-aperture making 
a closing-triangle is un-correlated and has identical phase 
noise $\sigma_\varphi$. This may not be reasonable for some AO systems
(e.g. if tip/tilt errors dominate due to tip/tilt mirror bandwidth)
but as this depends on reconstructor and wavefront
sensor details, it is a good first approximation. 

An exposure of total time T can then be split into $f_c T$
sub-exposures, each of which has independent phase noise,
so that in each sub-exposure we have pupil-plane sub-aperture
piston phases:

$$\varphi_A \sim \mathcal{N}(0, \sigma_\varphi^2) \tag{11}$$
$$\varphi_B \sim \mathcal{N}(0, \sigma_\varphi^2) \tag{12}$$
$$\varphi_C \sim \mathcal{N}(0, \sigma_\varphi^2). \tag{13}$$

Applying Equation (7) to this phase noise distribution for
$f_c T \gg 1$ gives the standard deviation of closure-phase (see
Appendix B for a derivation):

$$\sigma(\phi_{\rm cp}, {\rm temporal}) = \sigma_\varphi^3 \sqrt{3/(f_c T)} \tag{14}$$
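
A quick Monte Carlo check of Equation (14), read here as the cube of the piston noise times the square root of 3/(f_c T), is sketched below in Python with NumPy. It draws independent Gaussian pistons for each of the f_c T sub-exposures and applies the third-order closure-phase term of Equation (7). The function names and the chosen sample sizes are illustrative assumptions.

```python
import numpy as np


def sigma_cp_temporal(sigma_phi, fc, t_exp):
    """Closure-phase standard deviation predicted by Equation (14)."""
    return sigma_phi ** 3 * np.sqrt(3.0 / (fc * t_exp))


def simulate_sigma_cp(sigma_phi, n_sub, n_trials=2000, seed=0):
    """Scatter of the third-order closure phase over many exposures,
    each made of n_sub independent sub-exposures."""
    rng = np.random.default_rng(seed)
    phi = rng.normal(0.0, sigma_phi, (n_trials, 3, n_sub))
    phi -= phi.mean(axis=2, keepdims=True)      # piston-corrected phases
    a, b, c = phi[:, 0], phi[:, 1], phi[:, 2]
    cubes = (b - a) ** 3 + (c - b) ** 3 + (a - c) ** 3
    cp = -cubes.mean(axis=1) / 6.0              # third-order term of Equation (7)
    return cp.std()


# 0.3 rad piston noise with f_c * T = 500 sub-exposures.
predicted = sigma_cp_temporal(0.3, fc=10.0, t_exp=50.0)
simulated = simulate_sigma_cp(0.3, n_sub=500)
print(predicted, simulated)
```

The two numbers should agree to within the Monte Carlo scatter of a few per cent, since the sum of cubes of three differences that sum to zero has variance 108 times the sixth power of the piston noise for uncorrelated Gaussian pistons.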

In the short exposure regime, we are dominated by
atmospheric piston, as in the case of aperture-masking
interferometry without adaptive optics (e.g. Tuthill et al.
2000). In this regime, for typical exposure times $\Delta t$ less
than ~20 ms at a 2.2 μm wavelength, or ~50 ms at 4 μm
wavelength without adaptive optics or fringe-tracking, we
can still consider phase errors at third-order with reasonable
accuracy. By evaluating Equation (7) numerically based
on Kolmogorov turbulence, we arrive at:

$$\sigma(\phi_{\rm cp}, {\rm temporal}) = 0.0177 \left(\frac{\Delta t}{t_0}\right)^{2.5}, \tag{15}$$

which is valid for $\Delta t < t_0$. This kind of relationship
also has relevance to long-baseline interferometry in the case
of measurements where visibilities are measured simultaneously.
Examples of this are MIRC (Monnier et al. 2004) or
PAVO (Ireland et al. 2008) at the CHARA array. This relationship
does not apply to scanning beam combiners, where
fringes can be recorded non-simultaneously depending on
group delay tracking accuracy.

2.3 Spatial Closure-Phase Errors 

In this section, we will examine how wavefront phase corru- 
gations affect closure- or kernel- phases. In practice, these 
effects can be calibrated as long as the spatial aberrations re- 
main constant between observations of a target and PSF cal- 
ibrator - but slowly time-variable spatial aberrations (often 
called quasi-static speckles) will result in a less than perfect 
calibration. To most easily compare kernel phase to closure-phase,
we adopt a factor of $1/\sqrt{3}$ scaling to the closure-phase,
so that adding the three baseline phases is equivalent
to multiplying by a unit vector (e.g. one of the orthonormal
columns of the matrix V from Martinache 2010).

Figure 2 shows a comparison between simulated sparse
aperture-masking and kernel-phase data analysis for a vari- 
ety of aberration spatial frequencies and aberration ampli- 
tudes. For each amplitude and spatial frequency, the posi- 
tion angle of a sinusoidal aberration was randomly varied 
and the overall RMS kernel-phase computed. It can be seen 
that although both kernel phase and closure-phase appear 
equivalent to first-order, they have quite different responses 
to high-order pupil plane errors. The spatial filtering of an 
aperture-mask means that it can be effectively used at much
lower instantaneous Strehl ratios than unobstructed-pupil 
kernel-phase, but in a high-Strehl regime, kernel-phase is in 
principle superior. For the 0.35 radians RMS phase error
case (right-hand panel), Equation (7) predicts closure-phases
approximately 2 times lower than the simulation, possibly
due to Fourier sampling and windowing effects in the sparse
aperture-masking pipeline used, and possibly due to effects
higher than 3rd order in pupil-plane phase. For very high
instantaneous Strehls, kernel phase in both geometries is
expected to scale as the cube of the pupil-plane phase error,
which is $(1 - S)^{3/2}$ in the Marechal approximation.

A comparison between imaging with an unobstructed 
aperture and with sparse aperture masks is complicated 
somewhat by the ability to window data, which smooths 
over high spatial frequency aberrations. This gives an in 
principle further advantage to an unobstructed aperture or 
a mask with large holes where the interferogram has a rel- 
atively small spatial extent. An example of a regime where 
fine spatial scale aberrations may dominate phase errors 
post-calibration is when aberrated pupil-plane elements or 
masks shift due to flexure effects. 



2.4 Flat Field Errors 

Figure 2. The effect of RMS pupil-plane phase errors of 1 radian (left), 0.7 radians (centre) and 0.35 radians (right) on raw aperture-masking
Fourier phase (black dot-dashed), full-pupil Kernel-phase (blue) and aperture-masking closure-phase (red) scaled by a factor of
$1/\sqrt{3}$ as described in the text. The pupil geometries are the Keck non-redundant 9-hole mask and the full Keck pupil.

In sparse aperture masking, many pixels are used to record
fringes that affect only a small spatial scale. If target and
calibrator objects are not acquired on the same pixels, then
the effect of flat field errors is to add random phase errors
across the Fourier plane. A flat field error can be modelled
as multiplication in the image plane by a function that is
1.0 everywhere plus white-noise with standard deviation $\sigma_F$.
A typical value of $\sigma_F$ is $10^{-3}$, arising from a series of flat
field exposures with a total of $10^6$ photo-electrons per pixel.
Multiplication by this flat is equivalent to convolution in
the Fourier domain, which spreads the power from the zero
and near-zero spatial frequency components over the full
Fourier plane. Clearly phase errors will then be proportional
to $\sigma_F$ and inversely proportional to visibility. Numerical
simulations give the following relationship for closure-phase in
sparse aperture-masking observations:

$$\sigma(\phi_{\rm cp}, {\rm flat}) \approx 0.3\,\frac{\sigma_F}{V}, \tag{16}$$

where V is the fringe visibility, referenced to a perfect
Strehl interferogram of a point source. The constant of ~0.3
varies between approximately 0.2 and 0.3 for different bandpass
filters and aperture masks. To ensure that these errors
are less than $10^{-3}$ radians with typical visibilities of 0.3, we
need $\sigma_F < 10^{-3}$, meaning at least $10^6$ photons per pixel
recorded when taking flat fields.
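
The flat-field requirement quoted above follows from simple arithmetic, sketched here in Python. The sketch assumes that the flat-field error is purely shot-noise limited, so that the fractional error is the inverse square root of the accumulated photo-electrons per pixel; the function name and the default constant of 0.3 are illustrative.

```python
def flat_field_requirement(target_sigma_cp, visibility, coeff=0.3):
    """Largest tolerable flat-field error and the photo-electrons per pixel
    needed to reach it, assuming shot-noise-limited flat fields."""
    sigma_f = target_sigma_cp * visibility / coeff   # invert sigma_cp = coeff * sigma_f / V
    photons_per_pixel = 1.0 / sigma_f ** 2           # shot noise: sigma_f = N ** -0.5
    return sigma_f, photons_per_pixel


sigma_f, n_photons = flat_field_requirement(1e-3, visibility=0.3)
print(sigma_f, n_photons)
```

With a 1 milliradian target and a visibility of 0.3 this reproduces a flat-field error of one part in a thousand, or about a million photo-electrons per pixel.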

2.5 Bad Pixels 

The existence of bad pixels on an imaging array can often 
destroy sensitivity in traditional imaging over a small por- 
tion of the field of view. By spreading the information over 
many pixels, it may seem that at first glance bad pixels 
would always do significant harm to the information con- 
tent in aperture-masking observations. However, the limited 
Fourier support of this kind of observation, as long as it is 
better than Nyquist sampled, means that bad pixels can be 
very effectively corrected. In simulations, the algorithm be- 
low has proved effective at contrasts beyond lO" for arrays 
far worse than those found at telescopes where aperture-masks
are installed, meaning that if properly corrected, bad
pixels are not a cause of kernel-phase errors.

The principle of this bad pixel correction algorithm is
to assign the values to the bad pixels so that the power in
the Fourier domain outside the region of support permitted
by the pupil geometry is minimised. We will call this region
of the Fourier plane the zero region Z. We can turn this
problem into a linear one by realising that the Fourier components
corresponding to the set of bad pixel coordinates $x_b$
form a subspace of Z, and we can find a vector of bad pixel
offsets b to subtract so that the image Fourier transform on
this subspace is identically zero.

The first step in this process is to create the matrix
$\mathbf{B}_Z$ which maps the bad pixel values onto Z. The measured
values $f_Z$ in the Fourier plane region Z are then modelled
as:

$$f_Z = \mathbf{B}_Z\cdot b + \epsilon_Z, \tag{17}$$

with $\epsilon_Z$ being the remaining Fourier-plane noise. The
bad pixel adjustments b are then found using the Moore-Penrose
pseudo-inverse of $\mathbf{B}_Z$:

$$b = \mathbf{B}_Z^{+}\cdot f_Z \tag{18}$$
$$\phantom{b} = (\mathbf{B}_Z^{*}\cdot\mathbf{B}_Z)^{-1}\cdot\mathbf{B}_Z^{*}\cdot f_Z \tag{19}$$

The Moore-Penrose pseudo-inverse can also be found by
other methods such as singular-value-decomposition rather
than direct computation of an inverse as in Equation (19), but
this method suffices for a relatively small number of bad
pixels. Although this algorithm is very quick (the matrix
$\mathbf{B}_Z^{+}$ is pre-computed), the bad pixel correction of Equation (18)
does have to be applied for every frame, with the computed
values b subtracted off each frame. It can also be used to
correct for saturated pixels at the core of a PSF.
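
The bad pixel algorithm can be illustrated with a small self-contained example. The Python sketch below uses NumPy's FFT and pseudo-inverse on a toy band-limited image; the circular Fourier support, the image size, the bad pixel positions and the function names are all assumptions made for the illustration and do not correspond to any real mask or detector.

```python
import numpy as np


def bad_pixel_matrix(shape, bad_pixels, zero_region):
    """Matrix mapping bad pixel values onto the zero region Z: each column
    is the FFT of a unit impulse at one bad pixel, restricted to Z."""
    cols = []
    for y, x in bad_pixels:
        impulse = np.zeros(shape)
        impulse[y, x] = 1.0
        cols.append(np.fft.fft2(impulse)[zero_region])
    return np.stack(cols, axis=1)


def correct_bad_pixels(image, bad_pixels, zero_region, b_pinv):
    """Estimate the bad pixel offsets from the Fourier power in Z and
    subtract them from the frame."""
    f_z = np.fft.fft2(image)[zero_region]
    b = np.real(b_pinv @ f_z)                 # offsets of a real image are real
    fixed = image.copy()
    ys, xs = zip(*bad_pixels)
    fixed[list(ys), list(xs)] -= b
    return fixed


# Toy band-limited "interferogram": Fourier support is a disc of radius 0.2.
rng = np.random.default_rng(0)
n = 32
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
support = np.hypot(fx, fy) < 0.2
zero_region = ~support
truth = np.real(np.fft.ifft2(np.fft.fft2(rng.random((n, n))) * support))

bad_pixels = [(3, 4), (10, 20), (17, 5)]
corrupted = truth.copy()
for (y, x), err in zip(bad_pixels, [5.0, -3.0, 10.0]):
    corrupted[y, x] += err

# The pseudo-inverse is pre-computed once and reused for every frame.
b_pinv = np.linalg.pinv(bad_pixel_matrix(truth.shape, bad_pixels, zero_region))
fixed = correct_bad_pixels(corrupted, bad_pixels, zero_region, b_pinv)
print(np.max(np.abs(fixed - truth)))
```

In this noise-free toy case the corrupted pixels are recovered to numerical precision, because the only power in Z comes from the bad pixels themselves.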

2.6 Photon, Background and Readout Noise

Where the fringe visibility is V, the total number of photons
collected in an interferogram is $N_p$, the number of background
photons is $N_b$, and the number of holes in the aperture
mask is $N_h$, the closure-phase error due to photon (shot) noise
is:

$$\sigma(\phi_{\rm cp}, {\rm photon}) = \frac{N_h}{V N_p}\sqrt{1.5\,(N_p + N_b + n_p\sigma_{ro}^2)}. \tag{20}$$

The factor of $\sqrt{1.5}$ includes a factor of $\sqrt{3}$ due to photon
noise from three independent baselines making up the
closure-phase, as well as a factor of $\sqrt{1/2}$ due to the shot
noise power at any non-zero spatial frequency being split
equally between the real and imaginary parts. The readout
noise in photon units is $\sigma_{ro}$ and the number of pixels is $n_p$.
The effect of both readout and background noise is affected
by the size of the window function used prior to making the
Fourier transform to compute the visibilities, and this effect
can be minimised if fringes are directly fit to the data (e.g.
the SAMP pipeline of Lacour et al. 2011).

2.7 Dominant Error Terms 

The most common kind of kernel-phase data taken so far
has been sparse aperture-masking behind natural guide star
adaptive optics, particularly at 1.5-2.4 micron wavelengths,
so we will consider this regime first. We will also consider
that adequate flat-fields have been taken and bad pixels
properly corrected. The adaptive optics system only locks
when there are at least ~100 visible photons per Shack-Hartmann
lenslet in ~0.01 s, or ~$10^6$ photons in 100 s. With
a similar near-infrared and visible photon rate, and a similar
masking sub-aperture size to a Shack-Hartmann lenslet size,
Equation (20) would predict a ~0.4 degree photon-limited
closure-phase uncertainty for a 100 s integration and a 9-hole
aperture mask. We can use Equation (14) to predict the
effect of temporal phase errors: in particularly good seeing,
$\sigma_\varphi$ could be as low as 0.3 radians (giving a temporal
phase-noise limited Strehl of ~0.9) and $f_c$ could have a value
of 10 Hz. This would give a temporal phase-noise component
to closure-phase uncertainty of ~0.1 degrees. Perhaps
not surprisingly given how much light an aperture-mask
blocks, photon noise would dominate in this regime. However,
for less than ideal seeing conditions and targets which
are brighter in the infrared, the temporal phase noise dominates
over photon noise. A characteristic "typical seeing"
predicted closure-phase error for 0.5 radians RMS pupil-plane
phase error is 0.5 degrees.
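
The orders of magnitude in this comparison can be reproduced with a few lines of Python. The sketch below encodes the photon-noise expression of Section 2.6 as hole count over (visibility times photons) times the square root of 1.5 times the summed noise counts, and the long-exposure temporal expression of Section 2.2 as the cube of the piston noise times the square root of 3/(f_c T). The total photon count of 9 million (a million per hole) and the fringe visibility of 0.5 are assumptions adopted only to illustrate the arithmetic; they are not stated in the text.

```python
import math


def sigma_cp_photon(n_holes, visibility, n_photons,
                    n_background=0.0, n_pixels=0, read_noise=0.0):
    """Photon, background and readout noise limit on closure phase (radians)."""
    noise_counts = 1.5 * (n_photons + n_background + n_pixels * read_noise ** 2)
    return n_holes / (visibility * n_photons) * math.sqrt(noise_counts)


def sigma_cp_temporal(sigma_phi, fc, t_exp):
    """Long-exposure temporal phase noise limit on closure phase (radians)."""
    return sigma_phi ** 3 * math.sqrt(3.0 / (fc * t_exp))


# 100 s integration with f_c = 10 Hz, for good and typical seeing.
temporal_good = math.degrees(sigma_cp_temporal(0.3, 10.0, 100.0))
temporal_typical = math.degrees(sigma_cp_temporal(0.5, 10.0, 100.0))

# Assumed values: 9 holes, 1e6 photons per hole, visibility 0.5.
photon = math.degrees(sigma_cp_photon(9, 0.5, 9e6))

print(f"temporal (0.3 rad): {temporal_good:.2f} deg")
print(f"temporal (0.5 rad): {temporal_typical:.2f} deg")
print(f"photon noise:       {photon:.2f} deg")
```

Under these assumptions photon noise exceeds the temporal term in good seeing, while at 0.5 radians of pupil-plane phase error the two become comparable.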

The closure-phase uncertainties predicted here are similar
to the typical closure-phase uncertainties computed from
the standard error of the mean of individual observation sets
in survey papers such as Kraus et al. (2008). However, it is
certainly true that the residuals when subtracting closure-phases
from two point-sources are not always statistically
consistent with these standard errors. This kind of residual
is often called a calibration error, where the non-zero closure
phases described in Section 2.3 are not fully corrected by
observations of a calibrator star. Typical uncalibrated closure
phases from the Keck 9 hole aperture mask are 3.5 degrees
in H and K bands (CH4S and Kp filters), and 7 degrees in
L band (Lp filter). These non-zero closure phases are consistent
with having quasi-static spatial aberrations of ~0.5
radians amplitude as the dominant cause in the CH4S and
Kp filters (e.g. Figure 2) and atmospheric dispersion in the
Lp filter (Section 4). A small change in the cause of these
non-zero closure phases causes miscalibrations that can be
larger than the temporal (sub-aperture piston) phase and
photon noise effects.



3 CLOSURE-PHASE CORRELATIONS 

One of the more confusing aspects of aperture-masking data
analysis is knowing what to do with a linearly dependent
set of closure-phases. As described in Kulkarni (1989), these
phases may be linearly independent in the case of very low
signal-to-noise per exposure when the bispectrum is averaged,
but in the high signal-to-noise limit considered here,
with M non-redundant sub-apertures, there are
M(M - 1)(M - 2)/6 closure-phases but only (M - 1)(M - 2)/2
linearly independent closure-phases. A redundant aperture
has an even higher degree of correlation of the bispectrum
phases.

Simply choosing an arbitrary independent set of closure- 
phases for the purpose of modelling is not possible without 
a full consideration of the covariance matrix. If one consid- 
ers only the simplest forms of closure-phase errors, namely 
that due to readout-noise, then the problem of modelling the 
covariance matrix is not difficult. However, there are many 
other kinds of errors that can cause correlations between 
closure-phase errors. 

Previous work has either gone to great lengths to diagonalise
the measured covariance matrix of closure-phase (e.g.
Kraus et al. 2008) or has made an approximate scaling of fitting
errors to account for the closure-phase correlations (e.g.
Hinkley et al. 2011). The difficulty in any approach based
on real data is that the sample covariance matrix must be
modelled, and can not in general be measured completely
from the data. The reason for this is that where there are
fewer data frames taken than independent closure-phases,
the sample covariance matrix is necessarily singular.
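
The rank deficiency of the sample covariance matrix is easy to demonstrate. The Python sketch below draws a small number of frames of mock closure-phases (pure Gaussian noise, which is an assumption made only for this illustration) and measures the rank of their sample covariance; the function name is hypothetical.

```python
import numpy as np


def sample_covariance_rank(n_frames, n_observables, seed=0):
    """Rank of the sample covariance of n_frames mock data frames."""
    rng = np.random.default_rng(seed)
    frames = rng.normal(size=(n_frames, n_observables))
    cov = np.cov(frames, rowvar=False)    # n_observables x n_observables
    return np.linalg.matrix_rank(cov)


# 28 independent closure-phases (9-hole mask) but only 10 frames.
print(sample_covariance_rank(10, 28))
```

Subtracting the sample mean removes one degree of freedom, so with 10 frames the 28 by 28 matrix has rank at most 9 and cannot be inverted.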

These difficulties are all avoided if rather than consider- 
ing closure-phases as a primary observable, the linear combi- 
nations that make the kernel-phases are seen as the primary 
observables. This has added benefits of being able to extend 
the aperture-mask technique to considering baselines within
each sub-aperture (consequently extending the usable field 
of view) and using the same language for all adaptive optics 
image analysis that is independent of pupil-plane phase to 
first order. 

Of course, there are many different ways to form a set of
kernel-phases from a set of closure-phases, or indeed a linearly
independent set of kernel-phases. Martinache (2010)
suggested that kernel phases should be constructed so that
only orthonormal linear combinations of Fourier phase are
considered. However, this does not guarantee statistical
independence. In the simplest case of a centrally-concentrated
image limited by photon-noise, the spatial concentration of
the image variance means that neighbouring Fourier components
have highly correlated phase errors. This amounts to a
contrast loss when considering n-sigma excursions of kernel-phase,
because just like aperture-masking, the kernel-phase
technique as described by Martinache (2010) has a nearly
flat contrast limit curve beyond separations of ~λ/D. However,
standard imaging can have increasing contrasts as separations
increase beyond the PSF centre. This apparent loss
in sensitivity can be regained by properly considering the
correlation between Fourier phases, as shown below.

[Figure 3: plot against separation (arcsec), with curves labelled "Stat. Ind. Kerphase" and "Orthogonal KerPhase".]

Figure 3. The effect of photon-noise on Kernel-phase detections,
based on a simulated photon-limited image with $10^6$ photons
taken with the unobstructed Keck telescope in the Lp filter. The
decreased number of photons far from the PSF core means that
Kernel-phases sensitive to these spatial locations have smaller
errors, increasing the achievable contrast. Although the Kernel-phases
in each situation are equivalent, the uncertainties are not
equivalent, and would require a full covariance matrix in the case
of the orthogonal kernel-phase.

© 2013 RAS, MNRAS 000, 1-??

M.J. Ireland et al.



3.1 Statistically-Independent Kernel Phase 

Following from Section 1.1, we will define the matrix that
transforms the Fourier phase vector $\Phi$ to the vector of kernel-phases
as $\mathbf{K}_o$. This is an $N_K$ by $N_F$ matrix, where $N_K$ is the
number of Kernel-phases and $N_F$ is the number of Fourier
phases. The subscript o indicates that this matrix produces
an orthonormal set of phase linear combinations. We can
compute the sample covariance matrix of kernel phases $\mathbf{C}_K$
either directly or from the sample covariance matrix of
Fourier phases $\mathbf{C}$. This matrix can be diagonalised by the
finite-dimensional spectral theorem:

$$\mathbf{S}^{*}\,\mathbf{D}\,\mathbf{S} = \mathbf{C}_K = \mathbf{K}_o\,\mathbf{C}\,\mathbf{K}_o^{*}. \tag{21}$$

The matrix S is then a unitary matrix which allows us
to construct a set of statistically independent kernel phases
$\theta_s$ based on a new kernel-phase operator $\mathbf{K}_s$:

$$\theta_s = \mathbf{K}_s\cdot\Phi = \mathbf{S}\cdot\mathbf{K}_o\cdot\Phi. \tag{22}$$

As an example of the utility of this approach, I have
simulated the effects of photon-noise on Kernel-phase contrast
limits, as shown in Figure 3. The contrast standard
deviation was estimated by first estimating the standard
deviation of each Kernel-phase (i.e. neglecting covariances),
forming a vector $\sigma(\theta)$, then computing the contrast error
$\sigma_C$ using standard formulae for weighted averages:

$$\theta_m = \mathbf{K}\cdot m_\Phi \tag{23}$$
$$\sigma_C = 1\Big/\sqrt{\sum_k \frac{\theta_{m,k}^2}{\sigma^2(\theta_k)}} \tag{24}$$

Here $m_\Phi$ is the model phase divided by the contrast in
the high-contrast limit, e.g. for a 100:1 brightness ratio companion,
the phase would be approximated well by $0.01\,m_\Phi$.
It is clear that the contrast achieved by considering statistically
independent kernel-phases defined by $\mathbf{K}_s$ is superior to
the contrast achieved by orthonormal kernel-phases defined
by $\mathbf{K}_o$, for companions away from the PSF core.
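
The construction of an orthonormal kernel-phase operator and its statistically independent counterpart can be sketched in a few lines of Python with NumPy. The toy transfer matrix below (one row per baseline of a 4-aperture non-redundant array, with entries -1 and +1) and the random covariance matrix are assumptions for illustration only; a full-pupil model would use a much larger matrix. All function names are hypothetical.

```python
import numpy as np


def pairwise_transfer_matrix(n_ap):
    """Toy transfer matrix: each baseline phase is a difference of two pistons."""
    pairs = [(i, j) for i in range(n_ap) for j in range(i + 1, n_ap)]
    A = np.zeros((len(pairs), n_ap))
    for row, (i, j) in enumerate(pairs):
        A[row, i], A[row, j] = -1.0, 1.0
    return A


def kernel_matrix(A, tol=1e-10):
    """Orthonormal rows spanning the left null space of A, so that K @ A = 0."""
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return U[:, rank:].T


def independent_kernel(K_o, C):
    """Rotate K_o so that the kernel-phase covariance becomes diagonal."""
    C_K = K_o @ C @ K_o.T
    D, V = np.linalg.eigh(C_K)     # C_K = V diag(D) V^T
    S = V.T                        # hence S^T diag(D) S = C_K
    return S @ K_o, D


A = pairwise_transfer_matrix(4)                # 6 baselines, 4 apertures
K_o = kernel_matrix(A)                         # 3 kernel-phases

rng = np.random.default_rng(0)
M = rng.normal(size=(6, 6))
C = M @ M.T + 6.0 * np.eye(6)                  # mock Fourier-phase covariance

K_s, D = independent_kernel(K_o, C)
cov_s = K_s @ C @ K_s.T
print(np.round(cov_s, 10))
```

Because S is orthogonal, the rotated operator still annihilates the pupil-plane phase, while its covariance is diagonal, so per-observable variances can be used directly without a full covariance matrix.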



4 CALIBRATION STRATEGIES 

4.1 Nearest Neighbour Calibration 

The simplest calibration technique is to subtract the kernel-phases
from a calibrator observed closest to the target in
time or space. A small extension to this technique (e.g.
Evans et al. 2011) is to use the average of several calibrators
observed nearby in time, rejecting outlier calibrator observations.
Outliers are most easily rejected by looking for
calibrators that, when used to calibrate the target, give spuriously
large closure-phases. For $N_c$ calibrators, this amounts
to calibrator weightings $\{a_k\}_{k=1}^{N_c}$ where each $a_k$ is either 0
or $1/N_u$, with $N_u$ the number of calibrators used. There are,
however, several weaknesses to this technique:

(i) With small numbers of calibrator observations, it is
difficult to avoid subjectivity in the choice to reject particular
calibrators.

(ii) For particularly noisy calibrator observations and
small systematic kernel phases, this process only adds noise.

(iii) All calibrators are weighted evenly, when the optimal
weighting of individual calibrators may even be negative.

(iv) Any astrophysical structure in calibrators, e.g. undetected
faint companions, contributes to any signal in final
calibrated data.

The third point may not be obvious, and is illustrated
in Figure 4. Whenever calibrators are all on one side of the
target in some space, then optimal calibration may extrapolate
past the position of the calibrators to the target.
This space may be real (such as zenith distance which produces
non-zero kernel phases due to dispersion) or a one
dimensional parameterisation of a hidden variable describing
a time-variable aberration. This approach is similar to
the potentially negative weighting of astrometric reference
stars in precision astrometry (Lazorenko 2006).
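
A toy example shows how negative weights arise from extrapolation. The Python sketch below assumes, purely for illustration, that the systematic kernel-phase is exactly linear in a single hidden variable z (for instance zenith distance), and solves for weights that sum to one and reproduce the target's value of z. The function name and the numerical values are hypothetical.

```python
import numpy as np


def extrapolation_weights(z_cal, z_target):
    """Calibrator weights that sum to 1 and match the target's hidden
    variable z, assuming systematics are linear in z (minimum-norm solution)."""
    z_cal = np.asarray(z_cal, dtype=float)
    constraints = np.vstack([np.ones_like(z_cal), z_cal])
    rhs = np.array([1.0, z_target])
    return np.linalg.lstsq(constraints, rhs, rcond=None)[0]


# Both calibrators lie on one side of the target in z.
weights = extrapolation_weights([0.0, 1.0], z_target=2.0)
print(weights)

# Mock kernel-phases: a systematic term proportional to z plus a target signal.
slope = np.array([0.02, -0.01, 0.03])
signal = np.array([0.001, 0.002, -0.001])
phi_cal = np.array([slope * 0.0, slope * 1.0])
phi_target = slope * 2.0 + signal

optimal = phi_target - weights @ phi_cal       # extrapolated calibration
nearest = phi_target - phi_cal[1]              # nearest-neighbour calibration
print(optimal, nearest)
```

The weights come out as -1 for the more distant calibrator and +2 for the nearer one, so the calibration term is twice the nearer calibrator's kernel-phases minus those of the more distant one; the extrapolated calibration recovers the injected signal, while the nearest-neighbour calibration leaves a residual systematic.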



4.2 Optimised Calibrator Weighting 

We will now proceed to define a more optimal set of cal- 
ibrator weightings {asj^j^. This set of calibrator weight- 
ings must minimise the residual closure-phases after fit- 
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Calibrator 1 



Calibrator 2 



Target 



Figure 4. An illustration of a situation where negative weighting 
of a calibrator may be optimal. Dispersion (illustrated by the red 
and blue circles) causes systematic kernel phases such that the 
kernel phases of Calibrator 2 {<j>c2) is the average of kernel- phases 
of the Target (0t) and Calibrator 1 {<f>ci)- The best estimate 
of the kernel-phases caused by dispersion for the Target is then 
20C2 - 



Then our new matrix P2 = U2P1 is a projection ma- 
trix onto S satisfying: 

P2CcpPi^D2, (28) 

Representing the data in this way enables, for example, 
the construction of variables that can be computed by 
the sum over variance-normalised square deviates of a set 
of independent data, without the explicit use of covariance 
matrices. A potential problem with this approach is that 
the sample covariance matrix estimated from the data has 
a rank equal to min(A'ind, Nf,- — 1), where A'fr is the number 
of data frames. Taken at face value, with A^fr < Ni^d, this 
process unreasonably restricts the closure-phases of a model 
of the target to lie on a very limited subspace in the space 
spanned the observed departures from the mean closure- 
phase. For this reason, we take Ccp above to be the weighted 
mean sample covariance matrix of all target and calibrator 
observations weighted by the inverse of the trace of each 
sample covariance matrix. We form the estimated errors of 
the target by: 

PiCtPi = D'2 (29) 

Our data and errors are then transformed to a set of 
kernel-phases x: 



ting a model, without significantly biasing the model fit. 
In this section, we wil l describe this process as applied in 
iKraus fc Ireland! l|2012l ). where the starting point is closure- 
phases rather than kernel-phases^ 

Following Appendix A of Kraus et al. (2008), we begin
by considering the closure-phases only on a subspace
spanned by the $N_{ind}$ linearly independent set of closure-phases.
Furthermore, we construct a basis vector set on this
subspace such that the closure-phase covariance matrix is
diagonal (or nearly so) when projected on to it. To see how
this is done, first note how closure-phases can be constructed
linearly from phases:

$\phi_{cp} = T \theta_p$. (25)

The matrix $T T^{+}$ (with $T^{+}$ the pseudo-inverse of $T$) then
projects any set of closure-phases onto the set spanned by the
linearly independent set of closure-phases. This matrix can be
diagonalised as $T T^{+} = U_1^{T} D_1 U_1$ by a diagonal matrix
$D_1$ and a unitary matrix $U_1$. The eigenvalues on the diagonal
of $D_1$ are either 0 or 1. By considering only the eigenvectors
with non-zero eigenvalues, we can write:

$T T^{+} = P_1^{T} P_1$ (26)

for an $N_{ind} \times N_{cp}$ projection matrix $P_1$. $P_1$ projects
onto a subspace $S$ spanned by an orthonormal set of linear
combinations of closure-phases.

Next, given a closure-phase covariance matrix $C_{cp}$, we
can modify the projection matrix so that it projects onto a
set of basis vectors for $S$ with a diagonal covariance matrix.
To accomplish this, we diagonalise the projection of $C_{cp}$:

$P_1 C_{cp} P_1^{T} = U_2^{T} D_2 U_2$. (27)
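
A minimal numerical sketch of the construction in Equations 26 to 28 is given here. It assumes that the projector is formed with the Moore-Penrose pseudo-inverse of T, and it uses a hypothetical 4-hole mask and an arbitrary random covariance matrix purely for illustration. The function name `closure_projections` and all variable names are illustrative and are not taken from the original work.

```python
import numpy as np

def closure_projections(T, C_cp):
    """Return (P1, P2, d2) for a closure-phase operator T and covariance C_cp."""
    # T @ pinv(T) is the orthogonal projector onto the range of T,
    # so its eigenvalues are 0 or 1 (Equation 26).
    proj = T @ np.linalg.pinv(T)
    evals, evecs = np.linalg.eigh((proj + proj.T) / 2.0)
    P1 = evecs[:, evals > 0.5].T              # N_ind x N_cp, orthonormal rows
    # Diagonalise the projected covariance (Equation 27).
    d2, V = np.linalg.eigh(P1 @ C_cp @ P1.T)
    U2 = V.T
    P2 = U2 @ P1                              # Equation 28: P2 C_cp P2^T = diag(d2)
    return P1, P2, d2

# Hypothetical 4-hole mask: 6 baselines, 4 closure triangles, 3 independent.
holes = 4
baselines = [(i, j) for i in range(holes) for j in range(i + 1, holes)]
triangles = [(i, j, k) for i in range(holes)
             for j in range(i + 1, holes) for k in range(j + 1, holes)]
T = np.zeros((len(triangles), len(baselines)))
for row, (i, j, k) in enumerate(triangles):
    T[row, baselines.index((i, j))] = 1.0
    T[row, baselines.index((j, k))] = 1.0
    T[row, baselines.index((i, k))] = -1.0

rng = np.random.default_rng(0)
B = rng.normal(size=(len(triangles), len(triangles)))
C_cp = B @ B.T                                # arbitrary symmetric covariance
P1, P2, d2 = closure_projections(T, C_cp)
print(P1.shape, np.allclose(P2 @ C_cp @ P2.T, np.diag(d2)))
```

The eigenvalue threshold of 0.5 simply separates the numerical zeros from the numerical ones of the projector.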



$x = P_2 \phi_{cp}$ (30)
$\sigma^2(x) = \mathrm{diag}(D'_2) + \Delta^2$ (31)

The non-diagonal terms of $D'_2$ are ignored, and any
values on the diagonal less than the median are set to the
median. This is a crude method to ensure our statistics are
reasonably robust, without resorting to studentizing a
multidimensional distribution. An alternative to this approach
might be a bootstrapping technique; however, in this case
there is no obvious way to estimate the variables $\{a_k\}$ below
or to account for the error in their estimation. The additional
uncertainty $\Delta$ accounts for calibration errors, to be
further defined below.
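
The median floor described above can be sketched in a few lines. This is only an illustration: the function name `floored_variances` and the argument `delta` (standing for the additional calibration uncertainty) are hypothetical names, and the input list stands for the diagonal of the projected target covariance.

```python
from statistics import median

def floored_variances(diag_d2, delta=0.0):
    """Floor variances at their median, then add a calibration term delta**2."""
    floor = median(diag_d2)
    return [max(v, floor) + delta ** 2 for v in diag_d2]

print(floored_variances([0.5, 4.0, 1.0, 2.0, 0.1]))
# prints [1.0, 4.0, 1.0, 2.0, 1.0]
```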

The next step is to find an optimal linear combination
of weights $\{a_k\}_{k=1}^{N_c}$, where $N_c$ is the number of
possible calibrators. By optimal, we mean that we want to maximise
the likelihood function for $\{a_k\}$ based on a null-model for
calibrated kernel-phases $x_c$:

$x_c = x_t - \sum_{k=1}^{N_c} a_k x_k$ (32)

$L(\{a_k\}) = \exp\left(-\sum_i \frac{x_{c,i}^2}{2\sigma^2(x_{c,i})}\right) \pi(\{a_k\})$, (33)

where we have explicitly subscripted $x_c$ with $i$ and
where $\pi(\{a_k\})$ is a Bayesian prior distribution for $\{a_k\}$.
The use of a restrictive prior as a regulariser is essential where
there are many calibrators in use, because if $N_c > N_{ind}$ and
there is a random error component, then there almost surely
exists an $\{a_k\}$ such that $x_c = 0$, subtracting any real
astrophysical signal. The prior chosen in Kraus & Ireland (2012)
was:

^ This equation as presented in Equation 1 of Kraus & Ireland
(2012) was potentially confusing, because the division was
$\pi(\{a_k\}) = \exp\left(-\sum_{i,k} \frac{a_k^2 \sigma_i^2(x_k)}{2\sigma_i^2(x_t)}\right)$, (34)

where $\sigma_i^2(x)$ is the variance of the $i$-th component of
$x$. This is certainly not the only choice of such a prior,
but it does have the essential feature of preferring calibrator
weights of zero, and also of reducing the weighting of
calibrators with large internal sample variances.

Once an optimal set of weights $\{a_k\}$ has been found by
maximising the likelihood function, the uncertainty on the
calibrated kernel-phases $x_c$ is given by:

$\sigma^2(x_c) = \sigma^2(x_t) + \sum_k a_k^2 \sigma^2(x_k)$. (35)

Note that this neglects any uncertainty in estimating
the $\{a_k\}$.
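
The weight optimisation can be illustrated with a simplified sketch. It assumes, for tractability, that the variances in the likelihood are held fixed at the target values and that the prior is a Gaussian penalty on each weight scaled by the ratio of calibrator to target variance, in which case the maximum is a ridge-regression solution available in closed form. The function name `calibrator_weights` and the test data are hypothetical, and the procedure used in the original work may differ in detail.

```python
import numpy as np

def calibrator_weights(x_t, var_t, x_cals, var_cals):
    """Weights maximising the simplified posterior described in the lead-in.

    x_t, var_t:        target kernel-phases and variances, shape (N_ind,)
    x_cals, var_cals:  calibrator kernel-phases and variances, shape (N_c, N_ind)
    """
    Xw = (x_cals / np.sqrt(var_t)).T                 # (N_ind, N_c), whitened
    yw = x_t / np.sqrt(var_t)
    ridge = np.diag((var_cals / var_t).sum(axis=1))  # penalty from the prior
    a = np.linalg.solve(Xw.T @ Xw + ridge, Xw.T @ yw)
    x_c = x_t - a @ x_cals                           # calibrated kernel-phases
    var_c = var_t + (a ** 2) @ var_cals              # propagated variances
    return a, x_c, var_c

# Hypothetical data: the target systematics are an exact combination of two
# calibrators, one of which enters with a negative weight.
rng = np.random.default_rng(1)
cals = rng.normal(size=(2, 10))
target = 0.7 * cals[0] - 0.2 * cals[1]
a, x_c, var_c = calibrator_weights(target, np.ones(10), cals,
                                   np.full((2, 10), 1e-12))
print(np.round(a, 6))
```

With very small calibrator variances the prior is weak and the weights approach the unregularised least-squares values; larger calibrator variances shrink the weights towards zero.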

Finally, the calibrator observations $\{x_k\}$ do not necessarily
span the space of the hidden parameters causing
non-zero point-source kernel-phases. For this reason, the
additional "calibration error" term $\Delta$ in Equation 31 was
iteratively added so that the reduced $\chi^2$ for the null-model
was 1.0, i.e.:

$\chi_r^2 = \frac{1}{N_{ind}} \sum_i \frac{x_{c,i}^2}{\sigma^2(x_{c,i})} = 1.0$. (36)
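
One way to find the additional calibration error is a bisection on its square, sketched below. The sketch assumes that the reduced chi-squared is normalised by the number of kernel-phases and that the same term is added to every variance; the function name `calibration_error_sq` is hypothetical.

```python
def calibration_error_sq(x_c, var_c, tol=1e-9):
    """Smallest delta**2 >= 0 giving a null-model reduced chi-squared <= 1."""
    n = len(x_c)

    def reduced_chi2(delta_sq):
        return sum(x * x / (v + delta_sq) for x, v in zip(x_c, var_c)) / n

    if reduced_chi2(0.0) <= 1.0:
        return 0.0                       # no additional error is needed
    lo, hi = 0.0, max(x * x for x in x_c)    # reduced_chi2(hi) < 1 here
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if reduced_chi2(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

print(calibration_error_sq([0.5, 0.5], [1.0, 1.0]))   # prints 0.0
```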

In approximately half of the data sets tested in the work
leading up to Kraus & Ireland (2012), no calibration error
$\Delta$ was needed. With values of the calibrated kernel-phases
$x_c$ and their errors $\sigma(x_c)$ so computed, a model such as
a bright star plus faint companion or a more complex image
can be fit using least-squares. This is, however, a biased
fit just like the LOCI technique (Lafrenière et al. 2007),
because the process of computing the weights $\{a_k\}$ partly
removes the binary signal, due to the null model for kernel-phase
in Equation 33. For this reason, in Kraus & Ireland (2012),
final values of model parameters were computed after
re-computing the $\{a_k\}$ with the best fit model subtracted
iteratively from the $x_c$.

4.3 Restricted Kernel Phase 

An alternative to the complexity of the calibration strat- 
egy in the previous section is to ignore the kernel-phases 
that require calibration, i.e. those kernel phases that are 
most affected by systematic errors. This leads us to the 
technique of considering Phase Observationally Independent 
of Systematic Errors (POISE) observables as the primary 
observables from high contrast imaging (especially imaging
with aperture-masking), as described below. This technique
is very similar to the technique of ignoring dominant
Karhunen-Loève eigenimages as a means of calibrating more
wide-field point-spread functions (Soummer et al. 2012).

Following Equation 26, we find a set of kernel-phases $y_k$
for each image $k$ by a projection of the Fourier phases $\theta_p$:

$y_k = K \theta_p$ (37)

element-by-element division, and the vector $\ell^2$-norm $|\cdot|$ was
used without being explicitly described.



for general kernel-phase, and

$y_k = P_1 \phi_{cp} = P_1 T \theta_p$ (38)

for aperture-masking. These images consist of image
sets $C_j$ for each PSF calibrator observation $j$. The sample
variance for kernel-phase $i$ computed over all calibrators
could be considered systematic if:

$s_i^2(\{y_k\}) > s_i^2(\{y_k : k \in C_j\})$ (39)

for all calibrator image sets $j$.

In the POISE technique, we simply compute the systematic
error components $\delta_i^2$ for each kernel-phase $i$, and:

(i) Ignore kernel-phases $y_i$ whenever

$\delta_i^2 > \beta \langle s_i^2(\{y_k : k \in C_j\}) \rangle_j$. (40)

A typical value for $\beta$ is 1, which rejects approximately 2 to
3 out of 28 kernel-phases for 9-hole Keck aperture-masking
data.

(ii) Add $\delta_i^2$ to each target observation's uncertainty
estimate for the remaining kernel-phases $i$.

This means that the process of calibration is completely
independent of the target, which was not the case in
Section 4.2, because in that technique calibrator weights were
chosen to minimise the calibrated target kernel-phases. The
technique requires at least 3 calibrator observations.
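
The selection step can be sketched as follows. The text does not spell out how the systematic component is computed, so this sketch assumes it is the excess of the variance over all calibrator images above the mean within-set variance, clipped at zero; that definition, the function name `poise_select` and the toy data are assumptions made for illustration only.

```python
from statistics import mean, variance

def poise_select(cal_sets, beta=1.0):
    """Select kernel-phases from calibrator data alone.

    cal_sets: list of calibrator image sets; each set is a list of
    kernel-phase vectors, one vector per image.
    Returns (keep, delta_sq), one entry per kernel-phase.
    """
    n_kp = len(cal_sets[0][0])
    keep, delta_sq = [], []
    for i in range(n_kp):
        within = mean(variance([img[i] for img in s]) for s in cal_sets)
        overall = variance([img[i] for s in cal_sets for img in s])
        d2 = max(overall - within, 0.0)   # ASSUMED systematic component
        delta_sq.append(d2)
        keep.append(d2 <= beta * within)
    return keep, delta_sq

# Three hypothetical calibrator sets with two images each and two
# kernel-phases: the first is stable, the second jumps between sets.
cal_sets = [
    [[0.0, 5.0], [0.2, 5.2]],
    [[0.1, -5.0], [-0.1, -4.8]],
    [[0.0, 0.0], [0.1, 0.2]],
]
keep, delta_sq = poise_select(cal_sets)
print(keep)   # prints [True, False]
```

Because only calibrator images enter the function, the selection is independent of the target data, which is the property emphasised in the text.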

4.4 Imaging with POISE 

To test the POISE algorithm, we will consider the data
set with the highest contrast detection published in the
literature so far, which is the K-band detection of structure
modelled as three compact sources around LkCa 15
(Kraus & Ireland 2012). In that paper, an optimised calibrator
weighting scheme was used (see Section 4.2), which
also meant that the MACIM algorithm (Ireland et al. 2006)
could be used to create images directly from the closure-phases
via an OIFITS input file.

In general, imaging from kernel-phases alone is computationally
intensive because of the nonlinear relationship between
the image-plane and Fourier phase. However, in the
high contrast regime, where interferometric visibility amplitudes
are unity within errors, we can approximate the
Fourier transform $F(\mathbf{u})$ of an image $I(\mathbf{x})$ normalised
to a total flux of unity as:

$F(\mathbf{u}) \approx 1 + i \int \sin(2\pi \mathbf{u} \cdot \mathbf{x}) I(\mathbf{x}) \, d\mathbf{x}$. (41)

In turn, the phase $\phi$ becomes:

$\phi(\mathbf{u}) \approx \int \sin(2\pi \mathbf{u} \cdot \mathbf{x}) I(\mathbf{x}) \, d\mathbf{x}$. (42)
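
A quick numerical check of this high-contrast approximation, for a star at the origin plus one faint companion, is sketched below. The sign convention of the Fourier transform, the baseline and the companion position are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import cmath
import math

def exact_phase(u, x0, contrast):
    """Phase of F(u) for a unit-flux image: star at origin, companion at x0."""
    theta = 2.0 * math.pi * (u[0] * x0[0] + u[1] * x0[1])
    F = (1.0 - contrast) + contrast * cmath.exp(1j * theta)
    return cmath.phase(F)

def linear_phase(u, x0, contrast):
    """Linearised phase; the star at the origin contributes sin(0) = 0."""
    theta = 2.0 * math.pi * (u[0] * x0[0] + u[1] * x0[1])
    return contrast * math.sin(theta)

u = (3.0, 1.0)       # hypothetical baseline, cycles per unit angle
x0 = (0.07, 0.02)    # hypothetical companion offset, same angular unit
diff = abs(exact_phase(u, x0, 0.01) - linear_phase(u, x0, 0.01))
print(diff < 5e-4)   # prints True
```

The residual scales with the square of the contrast, so at a contrast of 1% the linear form is accurate to roughly one part in a hundred of the signal.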

We can consider the image to be made of discrete pixel
values arranged in a vector $p$, so that the integral in
Equation 42 becomes a sum, and the values of Fourier phases $\phi$
and kernel-phases $\theta$ are represented by matrix
multiplication:

$\phi = M p$ (43)
$\theta = K M p \equiv A p$. (44)

Figure 5. An imaging fit to the 2010 November data set of
LkCa 15, originally published in Kraus & Ireland (2012). A uniform
prior was used which had a total flux of 2% of the image
flux, and the final fit at a reduced $\chi^2$ of 1.0 contained 1% of
the image flux, with the remaining 99% contained within the point
source star at the image centre.


This linear approximation to imaging means that minimising
kernel-phase $\chi^2$ subject to a differentiable regulariser
can be rapidly computed using a gradient descent
method. The result of this linear approach to imaging with
the Maximum Entropy regulariser can be seen in Figure 5,
where the resolved structures contain 1% of the total system
flux and the reduced $\chi^2$ of the image is 1.0. Quasi-static
spatial aberrations in this case do not limit the signal-to-noise
in the final image, other than reducing the number
of useable observables by 4%. For this kind of observation,
spatially-filtering the input wavefront (e.g. Huby et al. 2012;
Jovanovic et al. 2012) would not improve the achievable
contrast.
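
The linear imaging step can be sketched with a small projected gradient descent. To keep the example short, a quadratic regulariser towards a uniform prior is swapped in for the Maximum Entropy regulariser, the kernel operator is replaced by an identity placeholder, and the baselines and pixel grid are random or hypothetical; none of these choices is taken from the original code.

```python
import numpy as np

def linear_image_fit(A, y, var, p0, lam=1e-3, n_iter=500):
    """Minimise chi^2(p) + lam * |p - p0|^2 subject to p >= 0."""
    W = 1.0 / var
    H = 2.0 * (A.T * W) @ A + 2.0 * lam * np.eye(A.shape[1])
    step = 1.0 / np.linalg.eigvalsh(H)[-1]    # 1/L keeps the objective from rising
    p = p0.copy()
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (W * (A @ p - y)) + 2.0 * lam * (p - p0)
        p = np.maximum(p - step * grad, 0.0)  # project onto non-negative fluxes
    return p

rng = np.random.default_rng(0)
n_side = 8
grid = np.linspace(-0.1, 0.1, n_side)               # arcsec, hypothetical field
xx, yy = np.meshgrid(grid, grid)
pix = np.column_stack([xx.ravel(), yy.ravel()])     # (n_pix, 2) pixel positions
uv = rng.uniform(-20.0, 20.0, size=(36, 2))         # cycles/arcsec, hypothetical
M = np.sin(2.0 * np.pi * uv @ pix.T)                # pixels -> Fourier phases
K = np.eye(len(uv))                                 # placeholder kernel operator
A = K @ M                                           # pixels -> kernel-phases

truth = np.zeros(n_side * n_side)
truth[19] = 0.01                                    # one source with 1% of the flux
var = np.full(len(uv), 1e-6)
y = A @ truth
p0 = np.full(truth.size, 0.02 / truth.size)         # uniform prior, 2% of the flux

def chi2(p):
    return float(np.sum((A @ p - y) ** 2 / var))

p_fit = linear_image_fit(A, y, var, p0)
chi2_start, chi2_end = chi2(p0), chi2(p_fit)
print(chi2_end < chi2_start)
```

Because the model is linear in the pixel fluxes, the gradient is a pair of matrix products, which is why each iteration is cheap.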

The image in Figure 5 is cosmetically at least as good
as that shown in Kraus & Ireland (2012), but comes with
the significant benefit that the calibration process does not
directly affect the image: the POISE observables are
independent of the calibrator observations.



The image reconstruction code can be found at
http://code.google.com/p/pysco, the repository where all
code in this paper is intended to go after translation to Python.



Table 1. A comparison between fits of three additional point
sources to 2010 November K' sparse aperture mask data using linear
combinations of calibrator observations (Kraus & Ireland 2012,
KI12) and using the POISE observables. Parameters are separation
($\rho$), position angle ($\theta$) and magnitude difference with
respect to the primary ($\Delta m$). When adding uncertainties in
quadrature, differences are always consistent within $2\sigma$ and
in 7 out of 9 cases within $1\sigma$.



Parameter        KI12          POISE
ρ1 (mas)         67.0±3.2      65.1±3.1
θ1 (deg)         12.3±2.8      10.9±2.9
Δm1              7.40±0.19     6.89±0.18
ρ2 (mas)         64.4±1.5      62.6±1.9
θ2 (deg)         334.8±1.5     333.4±2.5
Δm2              6.59±0.09     6.36±0.11
ρ3 (mas)         82.5±2.4      78.0±4.1
θ3 (deg)         302.3±1.5     302.3±2.8
Δm3              7.06±0.12     7.02±0.18



5 CONCLUSIONS 

Aperture-mask interferometry has proven to be a power- 
ful technique to recover high contrast, asymmetric informa- 
tion at the diffraction limit of large telescopes. The reason 
for this success is the ability for closure-phase, a kind of 
kernel-phase, to give an observable largely independent of 
time-variable aberrations. I have described many of the key
sources of phase errors in this technique, as well as several
strategies for mitigating them. Of note are the Phase
Observationally Independent of Systematic Errors (POISE)
observables, which are a subset of all possible linear
combinations of closure-phases. Observations of calibrator stars
inform which linear combinations of phases constitute the 
POISE observables, but the analysis of the target observa- 
tions is performed quite independently of the calibrator ob- 
servations, leading to a more robust calibration method. 

The generalisation of the aperture-mask technique to 
full pupil images shows great promise in the form of the full 
pupil kernel-phase observable. Simulations show that pupil- 
plane phase errors higher than third-order affect full pupil 
kernel-phase more than aperture-mask kernel-phase, meaning
that full-pupil kernel-phase will likely be restricted to
observations with moderately high Strehl ratios.

The analysis presented here has implicitly involved only 
a monochromatic PSF from an imaging system. A future 
study of the effect of very broad bandwidths is needed. More 
importantly, an extension of this technique to work for the 
simultaneous wavelength-dispersed images formed by an in- 
tegral field unit could be very powerful. The scaling of PSF 
with wavelength as a speckle suppression technique could be 
equally well applied to observables in the Fourier domain as
it has been in image-plane analyses.
APPENDIX A: THIRD-ORDER BISPECTRUM
EXPANSION 

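We will begin by writing the combination of Equations 4
and 5 explicitly:

$b_{ABC} = \left(1 + i\langle\phi_B - \phi_A\rangle - \frac{1}{2}\langle(\phi_B - \phi_A)^2\rangle - \frac{i}{6}\langle(\phi_B - \phi_A)^3\rangle\right)$
$\times \left(1 + i\langle\phi_C - \phi_B\rangle - \frac{1}{2}\langle(\phi_C - \phi_B)^2\rangle - \frac{i}{6}\langle(\phi_C - \phi_B)^3\rangle\right)$
$\times \left(1 + i\langle\phi_A - \phi_C\rangle - \frac{1}{2}\langle(\phi_A - \phi_C)^2\rangle - \frac{i}{6}\langle(\phi_A - \phi_C)^3\rangle\right)$. (A1)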



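The 0th order terms in the $\Delta\phi$s are trivially collected
as 1, and the 1st order terms clearly cancel to give 0. The
second order terms are:

$-\frac{1}{2}\left[\langle(\phi_B - \phi_A)^2\rangle + \langle(\phi_C - \phi_B)^2\rangle + \langle(\phi_A - \phi_C)^2\rangle\right]$
$- \left[\langle\phi_B - \phi_A\rangle \cdot \langle\phi_C - \phi_B\rangle + \langle\phi_C - \phi_B\rangle \cdot \langle\phi_A - \phi_C\rangle + \langle\phi_A - \phi_C\rangle \cdot \langle\phi_B - \phi_A\rangle\right]$. (A2)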



Moving from this equation to Equation 6 requires the
substitution of Equations 8 through 10 as well as a recognition
of the following classes of trivial identities:



$\langle(\phi_B - \phi_A)\rangle = \langle\phi_B\rangle - \langle\phi_A\rangle$ (A3)
(A4)



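The 3rd order terms of Equation A1 are collected (after
minor simplification of the coefficient 1/2 terms) as:

$\Im(b_{ABC}) \approx -\frac{1}{6}\left[\langle(\phi_B - \phi_A)^3\rangle + \langle(\phi_C - \phi_B)^3\rangle + \langle(\phi_A - \phi_C)^3\rangle\right]$
$+ \frac{1}{2}\left[\langle\phi_B - \phi_A\rangle \cdot \langle(\phi_B - \phi_A)^2\rangle + \langle\phi_C - \phi_B\rangle \cdot \langle(\phi_C - \phi_B)^2\rangle + \langle\phi_A - \phi_C\rangle \cdot \langle(\phi_A - \phi_C)^2\rangle\right]$
$- \langle\phi_B - \phi_A\rangle \cdot \langle\phi_C - \phi_B\rangle \cdot \langle\phi_A - \phi_C\rangle$. (A5)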

Again, Equation 7 follows after substitution of
Equations 8 through 10 as well as applying trivial identities such
as A3 and A4.
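
The "minor simplification" of the coefficient 1/2 terms can be spelled out as a short worked step. This is a sketch under the assumption that angle brackets denote the averages appearing in the third-order expansion, using the shorthand $a$, $b$, $c$ (not used in the original) for the three phase differences, whose averages sum to zero.

```latex
a = \phi_B - \phi_A, \quad b = \phi_C - \phi_B, \quad c = \phi_A - \phi_C,
\qquad \langle a \rangle + \langle b \rangle + \langle c \rangle = 0 .

% Cross terms between a first-order factor and a second-order factor:
-\frac{1}{2} \sum_{p \neq q} \langle p \rangle \langle q^2 \rangle
  = -\frac{1}{2} \sum_{q} \langle q^2 \rangle
      \Big( \sum_{p \neq q} \langle p \rangle \Big)
  = +\frac{1}{2} \sum_{q} \langle q \rangle \langle q^2 \rangle ,
\qquad p, q \in \{a, b, c\} .
```

The last equality uses the fact that the two averages other than $\langle q \rangle$ sum to $-\langle q \rangle$, which is what turns six cross terms into the three positive terms with coefficient 1/2.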



APPENDIX B: TEMPORAL PHASE ERRORS 

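In applying Equation 7 to temporal phase errors, we write
the instantaneous values of $\phi_A$, $\phi_B$ and $\phi_C$ as random
variables $X_A$, $X_B$ and $X_C$ respectively, which take a new
random value at $N$ statistically independent time steps. We can
then write:

$\mathrm{Var}(\phi_{cp}) = \frac{1}{36} \mathrm{Var}\left(\langle(\phi'_B - \phi'_A)^3\rangle + \langle(\phi'_C - \phi'_B)^3\rangle + \langle(\phi'_A - \phi'_C)^3\rangle\right)$ (B1)
$\approx \frac{1}{36N} \mathrm{Var}\left((X_B - X_A)^3 + (X_C - X_B)^3 + (X_A - X_C)^3\right)$ (B2)
$= \frac{1}{4N} \mathrm{Var}\left(X_A^2 X_B - X_A X_B^2 + X_B^2 X_C - X_B X_C^2 + X_C^2 X_A - X_C X_A^2\right)$ (B3)
$= \frac{3\sigma_\phi^6}{N}$. (B4)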



Here Var represents the variance of a quantity, which
in this special case of quantities of zero mean is simply the
expectation of the square. The approximately equals sign
($\approx$) in Equation B2 is used because we are ignoring the
piston subtraction, applicable only for $N \gg 1$ (and with an
error of order $N^{-1}$). The variables $X_A$, $X_B$ and $X_C$
are independent Gaussian variables with mean 0 and standard
deviation $\sigma_\phi$, so their moments are standard results,
and the expectation of a product of their moments is simply
the product of the expectation of their respective moments.
The variance on the right hand side of Equation B3 can
thus be simply but tediously evaluated as the sum over 36
mutual covariances to give a value of $12\sigma_\phi^6$. Finally,
Equation 2 follows directly from Equation B4, noting that the
number of independent phase samples $N = f_c T$.
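
The sum over 36 mutual covariances can be checked exactly with the standard moments of a zero-mean Gaussian. The sketch below assumes that the quantity inside the variance is the six-term cubic polynomial obtained by expanding the three cubed differences and dropping the common factor of 3; the function and variable names are illustrative only.

```python
from itertools import product

def gaussian_moment(n, sigma=1.0):
    """E[X**n] for X ~ N(0, sigma**2): (n-1)!! * sigma**n if n is even, else 0."""
    if n % 2:
        return 0.0
    value = 1.0
    for k in range(n - 1, 0, -2):
        value *= k
    return value * sigma ** n

# Signed monomials X_A**a * X_B**b * X_C**c, stored as (sign, (a, b, c)).
TERMS = [(+1, (2, 1, 0)), (-1, (1, 2, 0)),
         (+1, (0, 2, 1)), (-1, (0, 1, 2)),
         (+1, (1, 0, 2)), (-1, (2, 0, 1))]

def variance_b3(sigma=1.0):
    """Variance of the six-term sum as a sum of 36 mutual covariances."""
    total = 0.0
    for (s1, e1), (s2, e2) in product(TERMS, repeat=2):
        joint = 1.0
        for p, q in zip(e1, e2):
            joint *= gaussian_moment(p + q, sigma)
        total += s1 * s2 * joint   # every monomial has zero mean
    return total

print(variance_b3())       # prints 12.0
print(variance_b3() / 4)   # prints 3.0
```

Only the six diagonal terms (3 each) and six of the cross terms (-1 each) are non-zero, giving 18 - 6 = 12 in units of the sixth power of the standard deviation, and hence 3/N after the prefactor of 1/(4N).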